DETAILED ACTION 

Election/Restrictions 

Applicants' election without traverse of Group I, Claims 1 to 15, 20, 23, and 24 in 
the reply filed on 11 May 2005 is acknowledged. 

Claims 16 to 19 and 21 to 22 are withdrawn from further consideration pursuant 
to 37 CFR 1.142(b) as being drawn to a nonelected invention, there being no allowable 
generic or linking claim. Election was made without traverse in the reply filed on 
11 May 2005. 

Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

Claims 23 and 24 are rejected under 35 U.S.C. 102(a) as being anticipated by 
Ferrell. 

Regarding independent claim 23, Ferrell discloses an interactive speech and 
language training system, comprising: 

"means for converting input text to audible speech in a selected language, the 
audible speech being patterned after a model" - speech synthesizer 74 forms an audio 
representation of the vocabulary elements; vocabulary library 68 includes recorded 
digitized representations of vocabulary elements ("models") (column 7, lines 40 to 45: 
Figure 3); a vocabulary element, such as a word or phrase is presented both visually 
and aurally to the individual in a native language or a non-native language ("in a 
selected language") (column 4, lines 33 to 57: Figures 1 and 4); 

"means for receiving utterances spoken by a user in response to a prompt to 
replicate the audible speech" - a vocabulary element is presented both visually and 
aurally to the individual ("a prompt"), and the individual is given a period of time to 
initiate a response; the user's response is received; for example, the user may 
pronounce the vocabulary element (column 4, line 44 to column 5, line 10: Figure 1: 
Steps 12 to 14; Figure 4); 

"means for recognizing the utterances and provide feedback to the user on each 
sub-word or phoneme portion of the utterances, the feedback being comprised of a 
confidence measure reflecting a precision at which the user replicates the audible 
speech in the selected language based on a comparison of the utterances to one of the 
audible speech and the model, wherein the confidence measure is provided as scores 
for replication of at least one of paragraphs, sentences, words, sub-words or phonemes" 
- the responses are evaluated for correctness and appropriate feedback is presented to 
the user based on the correctness of the response; in the preferred embodiment, the 
feedback includes both visual and aural feedback; visual feedback is provided by a 
needle gauge at the bottom of the screen which indicates the degree of correct 
pronunciation ("confidence measure") (column 5, lines 8 to 25: Figure 1; Steps 18 and 
20); icon 84 provides visual feedback in the form of a confidence meter which indicates 
the correctness of a user response (column 8, lines 1 to 3: Figure 4); implicitly, a meter 
having a needle gauge reflects a numeric "score"; a vocabulary element may be a 
phoneme, word, phrase, sentence, or paragraph ("at least one of paragraphs, 
sentences, words and sub-words") (column 4, lines 44 to 51); visual and aural feedback 
is provided for each vocabulary element, and each vocabulary element can be a 
phoneme, which is also a sub-word portion. 

Regarding claim 24, Ferrell discloses appropriate feedback is presented to the 
user based on the correctness of the response; in the preferred embodiment, both 
visual and aural feedback is provided; aural feedback includes a synthesized voice 
which speaks the user's name along with an encouraging response (column 5, lines 7 to 
18: Figure 1). 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1, 5 to 7, 9, 11 to 14, and 20 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Ferrell in view of Mostow et al. 

Concerning independent claim 1, Ferrell discloses an interactive speech and 
language training system, comprising: 

"a first module configured to [receive repurposed input text from a repurposed 
source and] convert the input text to audible speech in a selected language, the audible 
speech being patterned after a model" - speech synthesizer 74 forms an audio 
representation of the vocabulary elements; vocabulary library 68 includes recorded 
digitized representations of vocabulary elements ("models") (column 7, lines 40 to 45: 
Figure 3); a vocabulary element, such as a word or phrase is presented both visually 
and aurally to the individual in a native language or a non-native language ("in a 
selected language") (column 4, lines 33 to 57: Figures 1 and 4); 

"a user interface configured to receive utterances spoken by a user in response 
to a prompt to replicate the audible speech" - a vocabulary element is presented both 
visually and aurally to the individual ("a prompt"), and the individual is given a period of 
time to initiate a response; the user's response is received; for example, the user may 
pronounce the vocabulary element (column 4, line 44 to column 5, line 10: Figure 1: 
Steps 12 to 14; Figure 4); 

"a second module configured to recognize the utterances and provide feedback 
to the user, the feedback being comprised of a confidence measure reflecting a 
precision at which the user replicates the audible speech in the selected language 
based on a comparison of the utterances to one of the audible speech and the model, 
wherein the confidence measure is provided as scores for replication of at least one of 
paragraphs, sentences, words and sub-words" - the responses are evaluated for 
correctness and appropriate feedback is presented to the user based on the 
correctness of the response; in the preferred embodiment, the feedback includes both 
visual and aural feedback; visual feedback is provided by a needle gauge at the bottom 
of the screen which indicates the degree of correct pronunciation ("confidence 
measure") (column 5, lines 8 to 25: Figure 1; Steps 18 and 20); icon 84 provides visual 
feedback in the form of a confidence meter which indicates the correctness of a user 
response (column 8, lines 1 to 3: Figure 4); a vocabulary element may be a phoneme, 
word, phrase, sentence, or paragraph ("at least one of paragraphs, sentences, words 
and sub-words") (column 4, lines 44 to 51); implicitly, a meter having a needle gauge 
reflects a numeric "score". 

Concerning independent claim 1, the only element not expressly disclosed by 
Ferrell is that the input text is a "repurposed input text from a repurposed source". 
However, Mostow et al. teaches a related reading and pronunciation tutor involving 
speech recognition, where an external application such as a tutor for another domain, 
may dynamically supply text for the tutor to help the user to read. The content of the 
input text may be input from any of several sources by any of several processes. Text 
and resources may be imported from a pre-existing source directly into a knowledge 
base. An external application, such as a tutor from another domain, may dynamically 
supply text for the tutor to help the user to read. (Column 8, Lines 51 to 61: Figure 1) 
The objective is to enable content to be created by operating the tutor in an authoring 
mode or during normal tutoring operations, thereby eliminating the time and expense of 
having to prepare a separate tutor for each story or group of stories. (Column 2, Lines 
32 to 37) It would have been obvious to supply repurposed input text from a 
repurposed source as suggested by Mostow et al. in the interactive language instruction 
system of Ferrell for the purpose of saving time and expense of lesson preparation. 

Concerning claims 5 and 6, Ferrell discloses vocabulary library 68 ("files for 
storing model pronunciations") includes digital representations of vocabulary elements 
(column 7, lines 40 to 45: Figure 3); a vocabulary element may be a phoneme, word, 
sentence, or paragraph (column 4, lines 45 to 49), and is "a predictive model" in that a 
vocabulary element predicts what phoneme, word, sentence, or paragraph was spoken. 

Concerning claim 7, Ferrell discloses the presentation is divided into multiple 
lessons incorporating new vocabulary elements (column 4, lines 55 to 57; column 5, 
lines 26 to 36: Figure 2). 

Concerning claim 9, Ferrell discloses unfamiliar vocabulary elements are 
introduced with a definition ("dictionary files") (column 5, lines 33 to 36: Figure 2). 

Concerning claims 11 to 13, Ferrell omits tables storing mapping data between 
word subgroups and vocabulary words, between words and vocabulary words, and 
between words and examples of parts of speech. However, Mostow et al. teaches a 
related reading and pronunciation tutor where an automatic enhancement function 
includes a heuristic algorithm using tables. Lookup of information in tables identifies 
sets of words that rhyme with one another, words that look alike, start or end the same 
etc., by constructing a key for each word that says what set is that word's equivalence 
class. The word may also be decomposed into its root word and affixes, which implicitly 
involves identification of the word's part of speech (Column 9, Line 52 to Column 10, 
Line 33). It would have been obvious to one of ordinary skill in the art to include tables 
of related words as taught by Mostow et al. in the interactive language instruction 
system of Ferrell for the purpose of inferring the pronunciation of words not found in a 
dictionary. 

Concerning claim 14, Ferrell omits tables of punctuation, but Mostow et al. 
teaches that the tutoring function takes account of phrase boundaries as indicated by 
commas and certain other punctuation for the purpose of more accurately aligning 
recognition results against the text. (Column 5, Lines 11 to 22) It would have been 
obvious to one of ordinary skill in the art to include a table of punctuation indicating 
phrase boundaries in the interactive language instruction system of Ferrell for the 
purpose of more accurately aligning recognition results against the text as taught by 
Mostow et al. 

Concerning claim 20, Ferrell discloses icon 84 provides visual feedback in the 
form of a confidence meter, which indicates the correctness of a user response (column 
8, lines 1 to 3: Figure 4); visual feedback is provided by a needle gauge at the bottom of 
the screen (column 5, lines 11 to 15); icon 84 provides visual feedback in the form of a 
confidence meter (column 8, lines 1 to 4); confidence meter is an "icon"; aural feedback 
includes a synthesized voice which speaks the user's name along with an encouraging 
response such as "Ron, that's close, let's try again." ("an audio segment") (column 5, 
lines 14 to 18). 

Claims 2 to 4 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Ferrell in view of Mostow et al. as applied to claim 1 above, and further in view of 
Henton. 

Concerning claim 2, Ferrell discloses visually displayed vocabulary elements 
(Figure 4), but omits: 

"a third module synchronized to the first module for producing a visual 
pronunciation aid in the form of an animated image of a human face and head 
pronouncing the audible speech." However, Henton teaches a method and apparatus 
for synthetic speech with an animated face, suggesting that it is well known to 
synchronize imaging of a face with synthetic speech for the purpose of instructing the 
user. (Column 3, Lines 33 to 49: Figure 3) It would have been obvious to one of 
ordinary skill in the art to include an animated face module as suggested by Henton in 
the multimodal interactive speech and language training system of Ferrell to 
synchronize an image with synthetic speech for the purpose of instructing a user. 

Concerning claim 3, Henton teaches a face and head, which is a "transparent" 
line drawing of a human face and head (Figure 3). 

Concerning claim 4, Henton teaches a voice table block is utilized by synthesizer 
to provide all needed phones or use aliases for any needed missing phones (column 5, 
lines 42 to 52: Figure 2); supplying phones for speech synthesis involves controlling one 
of "the vocal characteristics of the audible speech", i.e. how a phone is pronounced. 

Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Ferrell in 
view of Mostow et al. as applied to claim 1 above, and further in view of Doi et al. 

Ferrell discloses language instruction, but omits a mapping of sub-words in a 
first language to sub-words in a second language for illustrating sound alike 
comparisons to the student. However, Doi et a/, teaches a machine translation system, 
where data display control selects translation possibilities by dividing an original 
sentence into words as data A and providing a translated sentence with translated 
words as data B. (Column 7, Lines 13 to 66) The objective is to display the class of 
data to be selected in order to greatly simplify the selection of translation possibilities. 
(Column 2, Lines 14 to 49) It would have been obvious to one having ordinary skill in 
the art to provide a mapping of sub-words between first and second languages as 
suggested by Doi et al. in the multimodal interactive speech and language training 
system of Ferrell for the purpose of simplifying selection of translation possibilities. 

Claim 10 is rejected under 35 U.S.C. 103(a) as being unpatentable over Ferrell in 
view of Mostow et al. as applied to claim 1 above, and further in view of Adams, Jr. et 
al. 

Ferrell omits a record and playback module for providing playback of selected 
portions of audible speech and utterances from the user. However, Adams, Jr. et al. 
teaches a related system and method for interactive reading and language instruction 
including a session database for replay and resumption containing all the information 
necessary to provide a replay of the joint reading of the text by the companion and the 
student. (Column 4, Lines 17 to 29: Figure 2) Throughout the lesson the audio inputs 
from both the student and the computer instructor, along with the text as displayed for 
utterance by each party, are stored at the session database. Adams, Jr. et al. suggests 
that this enhances the learning experience by identifying areas for concentrated effort in 
the future. (Column 7, Lines 45 to 51) It would have been obvious to one of ordinary 
skill in the art to include a record and playback module in the system and method for 
interactive language training of Ferrell as suggested by Adams, Jr. et al. for the purpose 
of enhancing the lesson learning experience by identifying areas for concentrated effort. 

Response to Arguments 

Applicants' arguments submitted 15 September 2004 have been considered but 
are moot in view of the new grounds of rejection, necessitated by amendment. 

Allowable Subject Matter 

Claim 15 is objected to as being dependent upon a rejected base claim, but 
would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

Conclusion 

Applicants' amendment necessitated the new grounds of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicants are reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (571) 272-7608. 
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to 
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 
703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



ML 

July 7, 2005 




Examiner 

Group Art Unit 2654 



